Stata


2024-07-16 21:26 | Source: compiled from the web | Views: 265

Introduction to structural equation modeling (SEM)

Do you know what SEM is? (If you know what SEM is, read an overview of Stata’s SEM capabilities.)

SEM stands for structural equation modeling. SEM is

A notation for specifying structural equation models.
A way of thinking about structural equation models.
Methods for estimating the parameters of structural equation models.

Stata’s sem command implements linear structural equation models.

For those of you unfamiliar with SEM, it is worth your time to learn about it if you ever fit linear regressions, multivariate linear regressions, seemingly unrelated regressions, or simultaneous systems, or if you are interested in generalized method of moments (GMM).

As you may have figured out, SEM is based on the linear model. What it brings to the table is flexible specification—nearly anything can be allowed to be correlated or constrained to be uncorrelated—and unobserved (latent) variables which can be treated (almost) as if they were observed.

SEM fits the first and second moments of the distribution of observed variables (means, variances, and covariances) rather than fitting the observed values themselves. Both maximum likelihood and GMM methods are available; for GMM, sem uses the weighting matrix that corresponds to what the SEM literature calls asymptotic distribution free (ADF) estimation.

You still think of the model in the same way as usual, but in a model like

y_j = β_0 + β_1 x_{1j} + ... + β_k x_{kj} + e_j

let’s now call e_j the error. Reserve the word residual for the true residuals of the SEM, which are the differences between the observed and predicted moments.

When SEM is used to fit models that can be fit by the other linear estimators, one of three things holds. The results are the same. Or the results are asymptotically the same, by which we mean that they differ in finite samples and there is no theoretical reason to prefer one set of estimates to the other. Or the results are asymptotically the same and the SEM results should be better in finite samples for theoretical reasons.
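The first of these cases can be illustrated with a simple regression of y on a single x. The sketch below is in Python rather than Stata and is not how sem is implemented; the function names, the toy data, and the use of divisor n for the variances are all assumptions made for illustration. It picks the coefficients so that the model-implied means, variances, and covariance match the observed ones, compares them with least squares on the observed values, and then computes the residuals in the SEM sense (observed minus implied moments).

```python
def moments(x, y):
    """Sample means, variances, and covariance (divisor n, an assumed convention)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return mx, my, vx, vy, cxy


def fit_from_moments(x, y):
    """Choose b0, b1, and Var(e) so the implied moments equal the observed ones."""
    mx, my, vx, vy, cxy = moments(x, y)
    b1 = cxy / vx                # from Cov(x, y) = b1 * Var(x)
    b0 = my - b1 * mx            # from E(y) = b0 + b1 * E(x)
    var_e = vy - b1 ** 2 * vx    # from Var(y) = b1^2 * Var(x) + Var(e)
    return b0, b1, var_e


def fit_least_squares(x, y):
    """Least squares on the observed values, via the normal equations."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(a * a for a in x)
    sxy = sum(a * b for a, b in zip(x, y))
    b1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    b0 = (sy - b1 * sx) / n
    return b0, b1


# Toy data, invented for this sketch.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1, var_e = fit_from_moments(x, y)
ols_b0, ols_b1 = fit_least_squares(x, y)
print("moment fit:   ", b0, b1)
print("least squares:", ols_b0, ols_b1)

# Residuals in the SEM sense: observed minus model-implied moments.
mx, my, vx, vy, cxy = moments(x, y)
residuals = (
    my - (b0 + b1 * mx),              # mean of y
    vy - (b1 ** 2 * vx + var_e),      # variance of y
    cxy - b1 * vx,                    # covariance of x and y
)
print("moment residuals:", residuals)
```

Because this model has exactly as many parameters as moments to fit, the moment residuals are zero up to rounding, even though the per-observation errors e_j are not.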

Notation

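Individual structural equation models are usually described using path diagrams, such as

[Path diagram: a latent variable X, drawn in a circle, with an arrow to each of the observed variables x1, x2, x3, and x4, drawn in boxes.]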

This diagram is composed of

Boxes and circles with variable names written inside them. Boxes contain variables that are observed in the data. Circles contain variables that are unobserved, known as latent variables.

Arrows, called paths, that connect some of the boxes and circles. When a path points from one variable to another, that means the first variable affects the second. More precisely, if s->d, that means to add the term βk·s to the linear equation for d. βk is called the path coefficient. Sometimes small numbers are written along the arrow connecting two variables. That means βk is constrained to be the value specified. When no number is written along the arrow, the corresponding coefficient is to be estimated from the data. Sometimes symbols are written along the path arrow to emphasize this, and sometimes not. The same path diagram used to describe the model can be used to display the results of estimation. In that case, estimated coefficients appear along the paths.

Not shown above are curved, double-headed paths that are used to indicate covariances where they would not be otherwise assumed. Exogenous variables are assumed to be correlated.
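The rule that each path s->d contributes one term, a coefficient times s, to the equation for d can be sketched mechanically. The Python below is only an illustration: the function name, the tuple format for paths, and the labels a.d for intercepts and b1, b2, ... for free coefficients are all hypothetical, and the constraint of 1 on the first path is there only to show what a number written along an arrow does.

```python
def equations_from_paths(paths):
    """Build one linear equation per destination variable.

    Each path is (source, dest, fixed), where fixed is None for a coefficient
    to be estimated, or a number when the coefficient is constrained.
    """
    terms = {}   # dest -> list of right-hand-side terms, in path order
    k = 0        # running index for free (estimated) coefficients
    for source, dest, fixed in paths:
        if fixed is None:
            k += 1
            coef = f"b{k}"       # free coefficient
        else:
            coef = str(fixed)    # constrained to the number on the arrow
        terms.setdefault(dest, []).append(f"{coef}*{source}")
    return [
        f"{dest} = a.{dest} + " + " + ".join(rhs) + f" + e.{dest}"
        for dest, rhs in terms.items()
    ]


# One latent variable X with paths to four observed variables.
paths = [("X", "x1", 1), ("X", "x2", None), ("X", "x3", None), ("X", "x4", None)]
for equation in equations_from_paths(paths):
    print(equation)
```

A variable that receives several paths simply accumulates several terms in its equation, which is all the diagram notation encodes.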

Thus the above figure corresponds to the equations

x1 = α_1 + β_1 X + e.x1
x2 = α_2 + β_2 X + e.x2
x3 = α_3 + β_3 X + e.x3
x4 = α_4 + β_4 X + e.x4
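Since SEM fits moments rather than observed values, it can help to see which variances and covariances these four equations imply. The Python sketch below assumes the errors are uncorrelated with X and with each other, so that Var(xi) = βi² Var(X) + Var(e.xi) and Cov(xi, xj) = βi βj Var(X). The function names and parameter values are hypothetical, chosen only for illustration.

```python
def implied_covariance(loadings, var_X, error_vars):
    """Covariance matrix of x1..xp implied by xi = alpha_i + beta_i*X + e.xi.

    Assumes the errors are uncorrelated with X and with each other.
    """
    p = len(loadings)
    sigma = [
        [loadings[i] * loadings[j] * var_X for j in range(p)]
        for i in range(p)
    ]
    for i in range(p):
        sigma[i][i] += error_vars[i]   # error variance enters only on the diagonal
    return sigma


def moment_residuals(observed, implied):
    """Element-wise observed minus implied covariances."""
    return [
        [o - m for o, m in zip(obs_row, imp_row)]
        for obs_row, imp_row in zip(observed, implied)
    ]


# Hypothetical parameter values, not estimates from any data.
loadings = [1.0, 0.5, 2.0, 1.5]
sigma = implied_covariance(loadings, var_X=2.0, error_vars=[1.0, 1.0, 1.0, 1.0])
for row in sigma:
    print(row)
```

Estimation then amounts to choosing the parameters so that this implied matrix is as close as possible to the observed covariance matrix; moment_residuals returns what is left over.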

There’s a third way of writing this model, namely

(x1
